ETH Zürich & Microsoft Study: Demystifying Serverless ML Training

#artificialintelligence

Serverless computing is a new type of cloud-based computation infrastructure initially developed for web microservices and IoT applications. Major cloud providers offer it through services such as AWS Lambda, Azure Functions and Google Cloud Functions. Because it frees developers from concerns about capacity planning, configuration, management, maintenance, operation and scaling of containers, VMs and physical servers, serverless computing has gained popularity with machine learning (ML) researchers in recent years. Its benefits have also piqued interest in adopting it for data-intensive workloads such as ETL (extract, transform, load), query processing and ML, where it can provide significant cost reductions. Following this trend, a research team from ETH Zürich and Microsoft recently conducted a systematic, comparative study of distributed ML training over serverless (function-as-a-service, FaaS) and "serverful" (infrastructure-as-a-service, IaaS) infrastructures, aiming to identify and understand the system tradeoffs involved in distributed ML training on serverless infrastructure.